{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Chapter 7 Solutions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.1\n",
    "\n",
    "The first and the second derivatives of function 'f(x)' are given by\n",
    "\n",
    "$\n",
    "f'(x) = 3x^2 + 12 x -3,\\\\\n",
    "f''(x) = 6x\n",
    "$\n",
    "\n",
    "To find the positions of maxima and minima we solve equation for the points where the first derivative takes zero values:\n",
    "\n",
    "$\n",
    "3x^2 + 12 x -3 = 0 \\Rightarrow x_{1,2} = -2 \\pm \\sqrt{5}\n",
    "$\n",
    "\n",
    "Substituting each of the roots into the second derivative we obtain that\n",
    "\n",
    "$\n",
    "f''(x_1) = f''(-2 + \\sqrt{5}) > 0,\\\\\n",
    "f''(x_2) = f''(-2 - \\sqrt{5}) < 0,\n",
    "$\n",
    "\n",
    "i.e. $x_1 = -2 + \\sqrt{5}$ is the minimum, whereas $x_2 = -2 - \\sqrt{5}$ is the maximum of function $f(x)$. \n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.2\n",
    "\n",
    "The full batch update rule for gradient descent is\n",
    "\n",
    "$\n",
    "\\mathbf{\\theta}_{i+1} = \\mathbf{\\theta}_i - \\gamma_i\\sum_{n=1}^{N}\\left(\\nabla L_n(\\mathbf{\\theta}_i)\\right)^T\n",
    "$\n",
    "\n",
    "For a mini-batch of size $1$ at every step we would choose only one of the data points, i.e. we would choose randomly $n$ among the values $1...N$ and calculate the update as\n",
    "\n",
    "$\n",
    "\\mathbf{\\theta}_{i+1} = \\mathbf{\\theta}_i - \\gamma_i\\left(\\nabla L_n(\\mathbf{\\theta}_i)\\right)^T\n",
    "$\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.3\n",
    "\n",
    "### a)\n",
    "True. Indeed, any two points in the intersection of the two convex sets can be connected by a path, which belongs to each of the sets (since they are convext), and therefore is also a part of the intersection.\n",
    "\n",
    "### b)\n",
    "False. Indeed, if the two sets are completely disjoint, their union may not contain the points joining a point of one set to a point of the other.\n",
    "\n",
    "### c)\n",
    "False. Subtracting one set from the other may remove some of the points that belong to teh paths connecting the points of the remaining set.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.4\n",
    "\n",
    "### a) \n",
    "True. If $f(\\mathbf{x})$ and $g(\\mathbf{x})$ are convex functions, it means that for any $\\theta \\in[0,1]$:\n",
    "\n",
    "$\n",
    "f(\\theta \\mathbf{x} + (1-\\theta)\\mathbf{y}) \\leq \\theta f(\\mathbf{x}) + (1-\\theta)f(\\mathbf{y}),\\\\\n",
    "g(\\theta \\mathbf{x} + (1-\\theta)\\mathbf{y}) \\leq \\theta g(\\mathbf{x}) + (1-\\theta)g(\\mathbf{y}).\n",
    "$\n",
    "\n",
    "By adding these inequalities we immediately obtain the condition of convexity for their sum, $h(\\mathbf{x}) = f(\\mathbf{x}) + g(\\mathbf{x})$:\n",
    "\n",
    "$\n",
    "h(\\theta \\mathbf{x} + (1-\\theta)\\mathbf{y})=\n",
    "f(\\theta \\mathbf{x} + (1-\\theta)\\mathbf{y}) + g(\\theta \\mathbf{x} + (1-\\theta)\\mathbf{y})\n",
    "\\leq\n",
    "\\theta f(\\mathbf{x}) + (1-\\theta)f(\\mathbf{y}) + \\theta g(\\mathbf{x}) + (1-\\theta)g(\\mathbf{y})=\n",
    "\\theta h(\\mathbf{x}) + (1-\\theta)h(\\mathbf{y})\n",
    "$\n",
    "\n",
    "### b)\n",
    "False. For example; the difference of $f(x) = ax^2$ and $g(x) = bx^2$ is convex if $a \\geq b$, but concave otherwise\n",
    "\n",
    "### c)\n",
    "False. For example the product of two convex functions $f(x) = (x-a)^2$ and $g(x) = (x+a)^2$, $h(x) = (x-a)^2(x+a)^2$ has two minima and a maximum at $x=0$. For examlple, if we take $x=a$, $y=-a$ and $\\theta = 0.5$, then\n",
    "\n",
    "$\n",
    "h(\\theta x + (1-\\theta)y) = h(0) = a^4 > 0 = 0.5 h(a) + 0.5 h(-a) = \\theta h(x) + (1-\\theta)h(y),\n",
    "$\n",
    "\n",
    "i.e. the condition fo convexity is not satisfied.\n",
    "\n",
    "### d)\n",
    "The maximum of a function is not a function, so it does not have property of convexity. If however we talk about the maximum as a convext set, consisting of only one point, then it is trivially convex.  \n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.5\n",
    "\n",
    "We first introduce vectors $\\mathbf{y} = (x_0, x_1, \\xi)^T$ and $\\mathbf{c} = (p_0, p_1, 1)^T$, \n",
    "so that $\\mathbf{p}^T\\mathbf{x} + \\xi = \\mathbf{c}^T\\mathbf{y}$.\n",
    "\n",
    "We then introduce vector $\\mathbf{b} = (0, 3, 0)$ and matrix\n",
    "\n",
    "$\n",
    "\\mathbf{A} =\n",
    "\\begin{bmatrix}\n",
    "1 & 0 & 0\\\\\n",
    "0 & 1 & 0\\\\\n",
    "0 & 0 & -1\n",
    "\\end{bmatrix}\n",
    "$\n",
    "\n",
    "which allow us to write the three constraints as $\\mathbf{A}\\mathbf{y} \\leq \\mathbf{b}$.\n",
    "\n",
    "We now can write the problem as a standard linear program:\n",
    "\n",
    "$\n",
    "\\max_{\\mathbf{y} \\in \\mathbb{R}^3} \\mathbf{c}^T\\mathbf{y}\\\\\n",
    "\\text{subject to    } \\mathbf{A}\\mathbf{y} \\leq \\mathbf{b}\n",
    "$\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.6\n",
    "\n",
    "Let us define $\\mathbf{c} = (-5, -3)^T$, $\\mathbf{b} = (33, 8, 5, -1, 8)^T$ and \n",
    "\n",
    "$\n",
    "\\mathbf{A}=\n",
    "\\begin{bmatrix}\n",
    "2 & 2\\\\\n",
    "2 & -4\\\\-2 & 1\\\\\n",
    "0 & -1\\\\\n",
    "0 & 1\n",
    "\\end{bmatrix}\n",
    "$\n",
    "\n",
    "The linear program is then written as\n",
    "\n",
    "$\n",
    "\\min_{\\mathbf{x} \\in \\mathbb{R}^2} \\mathbf{c}^T\\mathbf{x}\\\\\n",
    "\\text{subject to    } \\mathbf{A}\\mathbf{x} \\leq \\mathbf{b}\n",
    "$\n",
    "\n",
    "The Lagrangian of this problem is\n",
    "\n",
    "$\n",
    "\\mathcal{L}(\\mathbf{x},\\mathbf{\\lambda}) = \n",
    "\\mathbf{c}^T\\mathbf{x} + \\mathbf{\\lambda}^T(\\mathbf{A}\\mathbf{x} - \\mathbf{b})=\n",
    "(\\mathbf{c}^T\\mathbf{x} + \\mathbf{\\lambda}^T\\mathbf{A})\\mathbf{x} - \\mathbf{\\lambda}^T\\mathbf{b}=\n",
    "(\\mathbf{c}\\mathbf{x} + \\mathbf{A}^T\\mathbf{\\lambda})^T\\mathbf{x} - \\mathbf{\\lambda}^T\\mathbf{b}\n",
    "$\n",
    "\n",
    "Taking gradient in respect to $\\mathbf{x}$ and setting it to zero we obtain the extremum condition\n",
    "\n",
    "$\n",
    "\\mathbf{c}\\mathbf{x} + \\mathbf{A}^T\\mathbf{\\lambda} = 0,\n",
    "$\n",
    "\n",
    "that is\n",
    "\n",
    "$\n",
    "\\mathcal{D}(\\mathbf{\\lambda}) = \\min_{\\mathbf{x} \\in \\mathbb{R}^2} \\mathcal{L}(\\mathbf{x},\\mathbf{\\lambda}) = - \\mathbf{\\lambda}^T\\mathbf{b}\n",
    "$\n",
    "\n",
    "that is the dual problem is given by\n",
    "\n",
    "$\n",
    "\\max_{\\mathbf{\\lambda} \\in \\mathbb{R}^5} - \\mathbf{b}^T\\mathbf{\\lambda}\\\\\n",
    "\\text{subject to    } \\mathbf{c}\\mathbf{x} + \\mathbf{A}^T\\mathbf{\\lambda} = 0 \\text{    and    } \\mathbf{\\lambda} \\geq 0\n",
    "$\n",
    "\n",
    "In terms of the original values of the parameters it can be thus written as\n",
    "\n",
    "$\n",
    "\\max_{\\mathbf{\\lambda} \\in \\mathbb{R}^5} - \n",
    "\\begin{bmatrix}\n",
    "33 \\\\ 8 \\\\ 5 \\\\ -1 \\\\ 8\n",
    "\\end{bmatrix}^T\n",
    "\\begin{bmatrix}\n",
    "\\lambda_1 \\\\ \\lambda_2 \\\\ \\lambda_3 \\\\ \\lambda_4 \\\\ \\lambda_5\n",
    "\\end{bmatrix}\n",
    "\\\\\n",
    "\\text{subject to    } -\n",
    "\\begin{bmatrix}\n",
    "5 \\\\ 3\n",
    "\\end{bmatrix}+\n",
    "\\begin{bmatrix}\n",
    "2 & 2 & -2 & 0 & 0\\\\\n",
    "2 & -4 & 1 & -1 & 1\n",
    "\\end{bmatrix}\n",
    "\\begin{bmatrix}\n",
    "\\lambda_1 \\\\ \\lambda_2 \\\\ \\lambda_3 \\\\ \\lambda_4 \\\\ \\lambda_5\n",
    "\\end{bmatrix} = 0\\\\\n",
    "\\text{and    } \n",
    "\\begin{bmatrix}\n",
    "\\lambda_1 \\\\ \\lambda_2 \\\\ \\lambda_3 \\\\ \\lambda_4 \\\\ \\lambda_5\n",
    "\\end{bmatrix} \\geq 0\n",
    "$\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.7\n",
    "\n",
    "We introduce $\\mathbf{Q} = \\begin{bmatrix} 2 & 1 \\\\ 1 & 4 \\end{bmatrix}$, $\\mathbf{c} = \\begin{bmatrix} 5\\\\3\\end{bmatrix}$, \n",
    "$\n",
    "\\mathbf{A} = \\begin{bmatrix}\n",
    "1 & 0\\\\-1 & 0\\\\\n",
    "0 & 1\\\\\n",
    "0 & -1\n",
    "\\end{bmatrix}\n",
    "$\n",
    "and\n",
    "$\n",
    "\\mathbf{b}=\n",
    "\\begin{bmatrix}\n",
    "1\\\\1\\\\1\\\\1\n",
    "\\end{bmatrix}\n",
    "$\n",
    "\n",
    "Then the quadratic problem takes form\n",
    "\n",
    "$\n",
    "\\min_{\\mathbf{x} \\in \\mathbb{R}^2} \\frac{1}{2}\\mathbf{x}^T\\mathbf{Q}\\mathbf{x} + \\mathbf{c}^T\\mathbf{x}\\\\\n",
    "\\text{subject to     }\n",
    "\\mathbf{A}\\mathbf{x} \\leq \\mathbf{b}\n",
    "$\n",
    "\n",
    "The Lagrangian corresponding to this problem is\n",
    "\n",
    "$\n",
    "\\mathcal{L}(\\mathbf{x},\\mathbf{\\lambda}) =\n",
    "\\frac{1}{2}\\mathbf{x}^T\\mathbf{Q}\\mathbf{x} + \\mathbf{c}^T\\mathbf{x} + \\mathbf{\\lambda}^T(\\mathbf{A}\\mathbf{x} - \\mathbf{b})=\n",
    "\\frac{1}{2}\\mathbf{x}^T\\mathbf{Q}\\mathbf{x} + (\\mathbf{c}^T+\\mathbf{A}^T\\mathbf{\\lambda})^T\\mathbf{x} - \\mathbf{\\lambda}^T\\mathbf{b}\n",
    "$\n",
    "\n",
    "where $\\mathbf{\\lambda} = \\begin{bmatrix}\\lambda_1\\\\\\lambda_2\\\\\\lambda_3\\\\\\lambda_4\\end{bmatrix}$\n",
    "We minimize the Lagrangian by setting its gradient to zero, which results in\n",
    "\n",
    "$\n",
    "\\mathbf{Q}\\mathbf{x} + \\mathbf{c}+\\mathbf{A}^T\\mathbf{\\lambda} = 0\n",
    "\\Rightarrow \n",
    "\\mathbf{x} = -\\mathbf{Q}^{-1}(\\mathbf{c}+\\mathbf{A}^T\\mathbf{\\lambda})\n",
    "$\n",
    "\n",
    "Substituting this back into the Lagrangian we obtain\n",
    "\n",
    "$\n",
    "\\mathcal{D}(\\mathbf{\\lambda})=\n",
    "\\min_{\\mathbf{x} \\in \\mathbb{R}^2} \\mathcal{L}(\\mathbf{x},\\mathbf{\\lambda}) =-(\\mathbf{c}+\\mathbf{A}^T\\mathbf{\\lambda})^T\\mathbf{Q}^{-1}(\\mathbf{c}+\\mathbf{A}^T\\mathbf{\\lambda})- \\mathbf{\\lambda}^T\\mathbf{b}\n",
    "$\n",
    "\n",
    "The dual problem is now\n",
    "\n",
    "$\n",
    "\\max_{\\mathbf{\\lambda} \\in \\mathbb{R}^4} -(\\mathbf{c}+\\mathbf{A}^T\\mathbf{\\lambda})^T\\mathbf{Q}^{-1}(\\mathbf{c}+\\mathbf{A}^T\\mathbf{\\lambda})- \\mathbf{\\lambda}^T\\mathbf{b}\\\\\n",
    "\\text{subject to    }\n",
    "\\mathbf{\\lambda} \\geq 0,\n",
    "$\n",
    "where the parameter vectors and matrices are defined above.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.8\n",
    "\n",
    "The primal problem can be written in teh standard form as\n",
    "\n",
    "$\n",
    "\\min_{\\mathbf{w}\\in\\mathbb{R}^D} \\frac{1}{2}\\mathbf{w}^T\\mathbf{w}\\\\\n",
    "\\text{subject to   } 1-\\mathbf{x}^T\\mathbf{w} \\leq 0 \n",
    "$\n",
    "\n",
    "The Lagrangian is then\n",
    "\n",
    "$\n",
    "\\mathcal{L}(\\mathbf{w},\\lambda)= \n",
    "\\frac{1}{2}\\mathbf{w}^T\\mathbf{w} + \\lambda(1-\\mathbf{x}^T\\mathbf{w})\n",
    "$\n",
    "\n",
    "Taking gradient in respect to $\\mathbf{w}$ we obtain the position of the minimum:\n",
    "$\n",
    "\\mathbf{w} = \\lambda \\mathbf{x}\n",
    "$\n",
    "\n",
    "Thus, the dual Lagrangian is\n",
    "\n",
    "$\\mathcal{D}(\\lambda)= \n",
    "\\min_{\\mathbf{w}\\in\\mathbb{R}^D}\\mathcal{L}(\\mathbf{w},\\lambda)=-\\frac{\\lambda^2}{2}\\mathbf{x}^T\\mathbf{x} +\\lambda\n",
    "$\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.9\n",
    "\n",
    "Assuming standard dot product, convex conjugate is defined as\n",
    "\n",
    "$\n",
    "f^*(\\mathbf{s}) = \n",
    "\\sup_{\\mathbf{x}\\in\\mathbb{R}^D} \\mathbf{s}^T\\mathbf{x} - f(\\mathbf{x})\n",
    "$\n",
    "\n",
    "Since function $f(\\mathbf{x})$ is continuous, differentiable and convex, looking for the supremum is equivalent to looking for the maximum, and can be done by setting the following gradient to zero:\n",
    "\n",
    "$\n",
    "\\nabla_{\\mathbf{x}}\\left(\\mathbf{s}^T\\mathbf{x} - f(\\mathbf{x})\\right) = \\mathbf{s}^T - \\nabla_\\mathbf{x}f(\\mathbf{x})=0\n",
    "$\n",
    "\n",
    "that is\n",
    "$\n",
    "s_d = \\frac{\\partial}{\\partial x_d}f(\\mathbf{x}) = \\log x_d - 1 \\Rightarrow x_d = e^{s_d + 1}\n",
    "$\n",
    "\n",
    "Then\n",
    "\n",
    "$\n",
    "f^*(\\mathbf{s})=\n",
    "\\sum_{d=1}^Ds_dx_d - \\sum_{d=1}^Dx_d\\log x_d=\n",
    "\\sum_{d=1}^Ds_d e^{s_d+1} - \\sum_{d=1}^De^{s_d+1}(s_d + 1)=-\\sum_{d=1}^De^{s_d+1}\n",
    "$\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.10\n",
    "\n",
    "Without loss of generality we can assume that $\\mathbf{A} = \\mathbf{A}^T$ is a symmetric matrix, as well as its inverse. By setting to zero gradient\n",
    "\n",
    "$\n",
    "\\nabla_{\\mathbf{x}}\\left(\\mathbf{s}^T\\mathbf{x} - f(\\mathbf{x})\\right)=\n",
    "\\mathbf{s}^T - \\mathbf{x}^T\\mathbf{A} - \\mathbf{b}=\n",
    "0\n",
    "$\n",
    "\n",
    "that is \n",
    "\n",
    "$\\mathbf{s} = \\mathbf{A}\\mathbf{x} + \\mathbf{b} \\Leftrightarrow \\mathbf{x} = \\mathbf{A}^{-1}(\\mathbf{s} - \\mathbf{b})$\n",
    "\n",
    "Therefore\n",
    "\n",
    "$\n",
    "f^*(\\mathbf{s}) =\n",
    "\\frac{1}{2}(\\mathbf{s} - \\mathbf{b})\\mathbf{A}^{-1}(\\mathbf{s} - \\mathbf{b}) - c\n",
    "$\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 7.11\n",
    "\n",
    "By definition of convex conjugate:\n",
    "\n",
    "$\n",
    "L^*(\\beta) = \\sup_{\\alpha \\in \\mathbb{R}} \\beta\\alpha - \\max\\{0, 1-\\alpha\\}=\n",
    "\\sup_{\\alpha \\in \\mathbb{R}}\n",
    "\\begin{cases}\n",
    "\\beta\\alpha, \\text{ if } \\alpha > 1\\\\\n",
    "\\beta\\alpha -1 + \\alpha, \\text{ if } \\alpha< 1\n",
    "\\end{cases}=\n",
    "\\sup_{\\alpha \\in \\mathbb{R}}\n",
    "\\begin{cases}\n",
    "\\beta\\alpha, \\text{ if } \\alpha > 1\\\\\n",
    "(\\beta+1)\\alpha -1, \\text{ if } \\alpha< 1\n",
    "\\end{cases}\n",
    "$\n",
    "\n",
    "We must here distinguish three cases:\n",
    "\n",
    "a) if $\\beta>0$, then $\\beta\\alpha$ and $(\\beta + 1)\\alpha -1$ are both increasing with $\\alpha$, and take their maximum values at the maximum posible value of $\\alpha$, that is\n",
    "\n",
    "$\n",
    "L^*(\\beta) = \n",
    "\\sup_{\\alpha \\in \\mathbb{R}}\n",
    "\\begin{cases}+\\infty, \\text{ if } \\alpha > 1\\\\\n",
    "\\beta, \\text{ if } \\alpha< 1\n",
    "\\end{cases}=+\\infty\n",
    "$\n",
    "\n",
    "b) if $\\beta < -1$ then then $\\beta\\alpha$ and $(\\beta + 1)\\alpha -1$ are both decreasing with $\\alpha$, and take their maximum values at the minimum possible value of $\\alpha$, that is\n",
    "\n",
    "$\n",
    "L^*(\\beta) = \n",
    "\\sup_{\\alpha \\in \\mathbb{R}}\n",
    "\\begin{cases}\n",
    "\\beta, \\text{ if } \\alpha > 1\\\\+\\infty, \\text{ if } \\alpha< 1\n",
    "\\end{cases}=+\\infty\n",
    "$\n",
    "\n",
    "c) Finally, if $-1 \\leq \\beta \\leq 0$, then $\\beta\\alpha$ is decreasing with $\\alpha$, whereas $(\\beta + 1)\\alpha -1$ is growing, i.e.\n",
    "\n",
    "$\n",
    "L^*(\\beta) = \n",
    "\\sup_{\\alpha \\in \\mathbb{R}}\n",
    "\\begin{cases}\n",
    "\\beta, \\text{ if } \\alpha > 1\\\\\n",
    "\\beta, \\text{ if } \\alpha< 1\n",
    "\\end{cases}=\n",
    "\\beta\n",
    "$\n",
    "\n",
    "We thus obtain\n",
    "\n",
    "$\n",
    "L^*(\\beta) =\n",
    "\\begin{cases}\n",
    "\\beta \\text{ if } -1\\leq \\beta \\leq 0\\\\+\\infty \\text{ otherwise}\n",
    "\\end{cases}\n",
    "$\n",
    "\n",
    "Let us now define function $M(\\alpha) = L^*(\\beta) + \\frac{\\gamma}{2}\\beta^2$ and calculate its convex conjugate function (we assume that $\\gamma> 0$):\n",
    "\n",
    "$\n",
    "M^*(\\alpha)=\n",
    "\\sup_{\\beta \\in \\mathbb{R}} \\alpha\\beta - L^*(\\beta) - \\frac{\\gamma}{2}\\beta^2=\n",
    "\\sup_{\\beta \\in [-1, 0]} \\alpha\\beta - L^*(\\beta) - \\frac{\\gamma}{2}\\beta^2=\n",
    "\\sup_{\\beta \\in [-1, 0]} \\alpha\\beta - \\beta - \\frac{\\gamma}{2}\\beta^2,\n",
    "$\n",
    "\n",
    "where the second equality is because outside of interval $[-1,0]$ we have $-L^*(\\beta) = -\\infty$.\n",
    "\n",
    "In interval $[-1,0]$ parabolic function $\\alpha\\beta - \\beta - \\frac{\\gamma}{2}\\beta^2 = - \\frac{\\gamma}{2}\\left(\\beta-\\frac{\\alpha-1}{\\gamma}\\right)^2 +\\frac{(1-\\alpha)^2}{2\\gamma}$ has a maximim at $\\beta = \\frac{\\alpha-1}{\\gamma}$, if this point is inside this interval, and at one of the interval edges otherwise. More precisely\n",
    "\n",
    "$\n",
    "M^(\\alpha)=\n",
    "\\begin{cases}\n",
    "0, \\text{ if } \\alpha > 1\\\\\n",
    "\\frac{(1-\\alpha)^2}{2\\gamma}, \\text{ if } 0\\leq\\alpha\\leq 1\\\\\n",
    "1-\\alpha-\\frac{\\gamma}{2}, \\text{ if } \\alpha > 1\n",
    "\\end{cases}\n",
    "$\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
